Smart Paper Notes

Saved You A Click: Automatically Answering Clickbait Titles

Oliver Johnson , Beicheng Lou , Janet Zhong , Andrey Kurenkov

Category: Natural Language Processing

2022-12-15

Clickbait articles often have a title phrased as a question or vague teaser that entices the user to click the link and read the article to find the explanation. We developed a system that automatically finds the answer to, or explanation of, the clickbait hook in the website text, so that the user does not need to read through the text themselves. We fine-tune an extractive question answering model (RoBERTa) and an abstractive one (T5), using data scraped from the 'StopClickbait' Facebook pages and Reddit's 'SavedYouAClick' subforum. Both the extractive and the abstractive model improve significantly after fine-tuning. The extractive model performs slightly better according to ROUGE scores, while the abstractive one has a slight edge in terms of BERTScore.
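
To make the ROUGE comparison concrete, here is a minimal sketch of a ROUGE-1 style F1 score (unigram overlap) in plain Python. It is a simplification for illustration only: the function name `rouge1_f1` and the example sentences are hypothetical, and standard ROUGE tooling applies further tokenization and stemming that this sketch omits.

```python
from collections import Counter


def rouge1_f1(candidate: str, reference: str) -> float:
    """Unigram-overlap F1 between a candidate answer and a reference answer."""
    cand_tokens = candidate.lower().split()
    ref_tokens = reference.lower().split()
    if not cand_tokens or not ref_tokens:
        return 0.0
    # Clipped overlap: each reference token can be matched at most once.
    overlap = sum((Counter(cand_tokens) & Counter(ref_tokens)).values())
    if overlap == 0:
        return 0.0
    precision = overlap / len(cand_tokens)
    recall = overlap / len(ref_tokens)
    return 2 * precision * recall / (precision + recall)


# Hypothetical reference answer and two hypothetical model outputs.
reference = "the actor is retiring next year"
extractive = "the actor is retiring"          # a span copied from the text
abstractive = "he plans to retire next year"  # a paraphrase

extractive_score = rouge1_f1(extractive, reference)
abstractive_score = rouge1_f1(abstractive, reference)
```

In this toy case the copied span scores higher than the paraphrase because ROUGE rewards exact word overlap, which is one plausible reason an extractive model can lead on ROUGE while an abstractive one leads on an embedding-based metric.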

Translated by Google Translate

Error-Aware Imitation Learning from Teleoperation Data for Mobile Manipulation

Josiah Wong , Albert Tung , Andrey Kurenkov , Ajay Mandlekar , Li Fei-Fei , Silvio Savarese , Roberto Martín-Martín

Categories: Robotics | Artificial Intelligence | Machine Learning

2021-12-09

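In mobile manipulation (MM), robots can both navigate within and interact with their environment, and can therefore complete many more tasks than robots capable of only navigation or only manipulation. In this work, we explore how to apply imitation learning (IL) to learn continuous visuo-motor policies for MM tasks. Much prior work has shown that IL can train visuo-motor policies for either the manipulation or the navigation domain, but few works have applied IL effectively to the MM domain. Doing so is challenging for two reasons: on the data side, current interfaces make it difficult to collect high-quality human demonstrations, and on the learning side, policies trained on limited data can suffer from covariate shift when deployed. To address these problems, we first propose Mobile Manipulation RoboTurk (MoMaRT), a novel teleoperation framework that allows simultaneous navigation and manipulation of mobile manipulators, and collect a first-of-its-kind large-scale dataset in a realistic simulated kitchen setting. We then propose a learned error detection system that addresses covariate shift by detecting when an agent is in a potential failure state. We train performant IL policies and error detectors from this data and, when training on expert data, achieve over 45% task success rate and 85% error detection success rate across multiple multi-stage tasks. Codebase, datasets, visualizations, and more are available at https://sites.google.com/view/il-for-mm/home.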


iGibson 2.0: Object-Centric Simulation for Robot Learning of Everyday Household Tasks

Chengshu Li , Fei Xia , Roberto Martín-Martín , Michael Lingelbach , Sanjana Srivastava , Bokui Shen , Kent Vainio , Cem Gokmen , Gokul Dharan , Tanish Jain

Categories: Robotics | Artificial Intelligence | Computer Vision | Machine Learning

2021-08-06

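Recent research in embodied AI has been boosted by the use of simulation environments to develop and train robot learning approaches. However, the use of simulation has skewed attention toward tasks that require only what robot simulators can simulate: motion and physical contact. We present iGibson 2.0, an open-source simulation environment that supports the simulation of a more diverse set of household tasks through three key innovations. First, iGibson 2.0 supports object states, including temperature, wetness level, cleanliness level, and toggled and sliced states, to cover a wider range of tasks. Second, iGibson 2.0 implements a set of predicate logic functions that map simulator states to logic states such as Cooked or Soaked. In addition, given a logic state, iGibson 2.0 can sample valid physical states that satisfy it. This functionality can generate potentially infinite task instances with minimal effort from users. The sampling mechanism allows our scenes to be populated more densely with small objects in semantically meaningful locations. Third, iGibson 2.0 includes a virtual reality (VR) interface that immerses humans in its scenes to collect demonstrations. As a result, we can collect human demonstrations of these new types of tasks and use them for imitation learning. We evaluate the new capabilities of iGibson 2.0 for enabling robot learning of novel tasks, in the hope of demonstrating the potential of this new simulator to support new research in embodied AI. iGibson 2.0 and its new dataset are publicly available at http://svl.stanford.edu/igibson/.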
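
The idea of predicate functions that map physical state to logic state, and of sampling physical states that satisfy a logic state, can be sketched as follows. This is a toy illustration and not the iGibson 2.0 API: the `ObjectState` class, the thresholds, the units, and the rejection-sampling strategy are all assumptions made for the example.

```python
import random
from dataclasses import dataclass
from typing import Callable


@dataclass
class ObjectState:
    """Hypothetical physical state of one simulated object."""
    temperature: float  # degrees Celsius (assumed unit)
    wetness: float      # 0.0 = dry, 1.0 = saturated (assumed scale)


# Assumed thresholds, chosen only for illustration.
COOKED_TEMPERATURE = 70.0
SOAKED_WETNESS = 0.8


def is_cooked(state: ObjectState) -> bool:
    """Predicate: map the physical state to the logic state 'Cooked'."""
    return state.temperature >= COOKED_TEMPERATURE


def is_soaked(state: ObjectState) -> bool:
    """Predicate: map the physical state to the logic state 'Soaked'."""
    return state.wetness >= SOAKED_WETNESS


def sample_state(predicate: Callable[[ObjectState], bool],
                 rng: random.Random,
                 max_tries: int = 1000) -> ObjectState:
    """Rejection-sample a physical state that satisfies a logic predicate."""
    for _ in range(max_tries):
        candidate = ObjectState(temperature=rng.uniform(0.0, 150.0),
                                wetness=rng.uniform(0.0, 1.0))
        if predicate(candidate):
            return candidate
    raise RuntimeError("no satisfying state found")


rng = random.Random(0)
cooked_state = sample_state(is_cooked, rng)
soaked_state = sample_state(is_soaked, rng)
```

Because each call to `sample_state` draws a fresh physical state, a single logical task description can yield many distinct task instances, which is the property the abstract highlights.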


Accelerating Barnes-Hut t-SNE Algorithm by Efficient Parallelization on Multi-Core CPUs

Narendra Chaudhary , Alexander Pivovar , Pavel Yakovlev , Andrey Gorshkov , Sanchit Misra

Categories: Machine Learning | Artificial Intelligence

2022-12-22

t-SNE remains one of the most popular embedding techniques for visualizing high-dimensional data. Most standard packages of t-SNE, such as scikit-learn, use the Barnes-Hut t-SNE (BH t-SNE) algorithm for large datasets. However, existing CPU implementations of this algorithm are inefficient. In this work, we accelerate the BH t-SNE on CPUs via cache optimizations, SIMD, parallelizing sequential steps, and improving parallelization of multithreaded steps. Our implementation (Acc-t-SNE) is up to 261x and 4x faster than scikit-learn and the state-of-the-art BH t-SNE implementation from daal4py, respectively, on a 32-core Intel(R) Icelake cloud instance.


Decision-making and control with metasurface-based diffractive neural networks

Jumin Qiu , Tianbao Yu , Lujun Huang , Andrey Miroshnichenko , Shuyuan Xiao

Category: Machine Learning

2022-12-21

The ultimate goal of artificial intelligence is to mimic the human brain in performing decision-making and control directly from high-dimensional sensory input. All-optical diffractive neural networks provide a promising route to artificial intelligence with high speed and low power consumption. To date, most reported diffractive neural networks focus on single or multiple tasks that do not involve interaction with the environment, such as object recognition and image classification, while networks that can perform decision-making and control have, to our knowledge, not yet been developed. Here, we propose to use deep reinforcement learning to realize diffractive neural networks that imitate the human-level capability of decision-making and control. Such networks allow optimal control policies to be found through interaction with the environment and can be readily realized with dielectric metasurfaces. The performance of these networks is verified on three classic games, Tic-Tac-Toe, Super Mario Bros., and Car Racing, where they reach levels comparable to or even higher than those of human players. Our work represents a solid step forward for diffractive neural networks, promising a fundamental shift from the target-driven control of a pre-designed state for simple recognition or classification tasks to the high-level sensory capability of artificial intelligence. It may find exciting applications in autonomous driving, intelligent robots, and intelligent manufacturing.


Transformers learn in-context by gradient descent

Johannes von Oswald , Eyvind Niklasson , Ettore Randazzo , João Sacramento , Alexander Mordvintsev , Andrey Zhmoginov , Max Vladymyrov

Categories: Machine Learning | Artificial Intelligence | Natural Language Processing

2022-12-15

Transformers have become the state-of-the-art neural network architecture across numerous domains of machine learning. This is partly due to their celebrated ability to transfer and to learn in-context based on few examples. Nevertheless, the mechanisms by which Transformers become in-context learners are not well understood and remain mostly an intuition. Here, we argue that training Transformers on auto-regressive tasks can be closely related to well-known gradient-based meta-learning formulations. We start by providing a simple weight construction that shows the equivalence of data transformations induced by 1) a single linear self-attention layer and by 2) gradient-descent (GD) on a regression loss. Motivated by that construction, we show empirically that when training self-attention-only Transformers on simple regression tasks either the models learned by GD and Transformers show great similarity or, remarkably, the weights found by optimization match the construction. Thus we show how trained Transformers implement gradient descent in their forward pass. This allows us, at least in the domain of regression problems, to mechanistically understand the inner workings of optimized Transformers that learn in-context. Furthermore, we identify how Transformers surpass plain gradient descent by an iterative curvature correction and learn linear models on deep data representations to solve non-linear regression tasks. Finally, we discuss intriguing parallels to a mechanism identified to be crucial for in-context learning termed induction-head (Olsson et al., 2022) and show how it could be understood as a specific case of in-context learning by gradient descent learning within Transformers.
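
The claimed equivalence between a linear self-attention layer and a gradient-descent step can be checked numerically in a simplified special case. The sketch below is not the paper's weight construction: it assumes scalar targets, a zero initial weight vector, and the loss L(w) = 1/(2N) Σ (w·x_i − y_i)², and all function names are hypothetical. Under those assumptions, one GD step gives the prediction (η/N) Σ y_i (x_i·x_q), which is exactly softmax-free attention with keys x_i, values (η/N) y_i, and query x_q.

```python
import random


def dot(a, b):
    """Inner product of two equal-length vectors."""
    return sum(ai * bi for ai, bi in zip(a, b))


def gd_prediction(xs, ys, x_query, lr):
    """Predict for x_query after ONE gradient step from w = 0."""
    n, dim = len(xs), len(x_query)
    w = [0.0] * dim
    # Gradient of 1/(2n) * sum_i (w.x_i - y_i)^2 with respect to w.
    grad = [sum((dot(w, x) - y) * x[j] for x, y in zip(xs, ys)) / n
            for j in range(dim)]
    w = [wj - lr * gj for wj, gj in zip(w, grad)]
    return dot(w, x_query)


def linear_attention_prediction(xs, ys, x_query, lr):
    """Softmax-free attention: keys x_i, values (lr/n) * y_i, query x_query."""
    n = len(xs)
    return sum((lr / n) * y * dot(x, x_query) for x, y in zip(xs, ys))


# Hypothetical in-context regression task: 8 examples in 3 dimensions.
rng = random.Random(0)
true_w = [1.0, -2.0, 0.5]
xs = [[rng.gauss(0.0, 1.0) for _ in range(3)] for _ in range(8)]
ys = [dot(true_w, x) for x in xs]
x_query = [rng.gauss(0.0, 1.0) for _ in range(3)]

gd_out = gd_prediction(xs, ys, x_query, lr=0.1)
attn_out = linear_attention_prediction(xs, ys, x_query, lr=0.1)
```

The two outputs agree up to floating-point rounding, which illustrates why a single linear attention layer can in principle implement one step of gradient descent on the in-context examples.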


Re-purposing Perceptual Hashing based Client Side Scanning for Physical Surveillance

Ashish Hooda , Andrey Labunets , Tadayoshi Kohno , Earlence Fernandes

Category: Computer Vision

2022-12-08

Content scanning systems employ perceptual hashing algorithms to scan user content for illegal material, such as child pornography or terrorist recruitment flyers. Perceptual hashing algorithms help determine whether two images are visually similar while preserving the privacy of the input images. Several efforts from industry and academia propose to conduct content scanning on client devices such as smartphones due to the impending roll out of end-to-end encryption that will make server-side content scanning difficult. However, these proposals have met with strong criticism because of the potential for the technology to be misused and re-purposed. Our work informs this conversation by experimentally characterizing the potential for one type of misuse -- attackers manipulating the content scanning system to perform physical surveillance on target locations. Our contributions are threefold: (1) we offer a definition of physical surveillance in the context of client-side image scanning systems; (2) we experimentally characterize this risk and create a surveillance algorithm that achieves physical surveillance rates of >40% by poisoning 5% of the perceptual hash database; (3) we experimentally study the trade-off between the robustness of client-side image scanning systems and surveillance, showing that more robust detection of illegal material leads to increased potential for physical surveillance.
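
As a rough illustration of how perceptual hashing supports database matching, the sketch below uses a deliberately simple average hash on tiny grayscale grids. This is a stand-in technique, not the hashing algorithm studied in the paper, which the abstract does not name; the function names, pixel values, and threshold are hypothetical.

```python
def average_hash(pixels):
    """Hash a 2D grid of grayscale values: bit is 1 where pixel > mean."""
    flat = [p for row in pixels for p in row]
    mean = sum(flat) / len(flat)
    return [1 if p > mean else 0 for p in flat]


def hamming(a, b):
    """Number of positions where two equal-length bit lists differ."""
    return sum(x != y for x, y in zip(a, b))


def matches_database(image, database, threshold):
    """Flag the image if its hash is within `threshold` bits of any entry."""
    image_hash = average_hash(image)
    return any(hamming(image_hash, entry) <= threshold for entry in database)


# Hypothetical 4x4 grayscale images.
original = [
    [10, 200, 30, 220],
    [15, 210, 25, 230],
    [12, 205, 35, 215],
    [18, 190, 28, 225],
]
brighter = [[min(255, p + 12) for p in row] for row in original]  # same scene
inverted = [[255 - p for p in row] for row in original]           # different

database = [average_hash(original)]
flag_brighter = matches_database(brighter, database, threshold=2)
flag_inverted = matches_database(inverted, database, threshold=2)
```

The client only ever compares hashes against the database, so whoever controls the database contents controls what gets flagged. That dependency is what makes the database-poisoning scenario described in the abstract possible.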


Using a Conditional Generative Adversarial Network to Control the Statistical Characteristics of Generated Images for IACT Data Analysis

Julia Dubenskaya , Alexander Kryukov , Andrey Demichev , Stanislav Polyakov , Elizaveta Gres , Anna Vlaskina

Category: Machine Learning

2022-11-28

Generative adversarial networks are a promising tool for image generation in the astronomy domain. Of particular interest are conditional generative adversarial networks (cGANs), which allow you to divide images into several classes according to the value of some property of the image, and then specify the required class when generating new images. In the case of images from Imaging Atmospheric Cherenkov Telescopes (IACTs), an important property is the total brightness of all image pixels (image size), which is in direct correlation with the energy of primary particles. We used a cGAN technique to generate images similar to those obtained in the TAIGA-IACT experiment. As a training set, we used a set of two-dimensional images generated using the TAIGA Monte Carlo simulation software. We artificially divided the training set into 10 classes, sorting images by size and defining the boundaries of the classes so that the same number of images falls into each class. These classes were used while training our network. The paper shows that for each class, the size distribution of the generated images is close to normal, with the mean value located approximately in the middle of the corresponding class. We also show that for the generated images, the total image size distribution obtained by summing the distributions over all classes is close to the original distribution of the training set. The results obtained will be useful for more accurate generation of realistic synthetic images similar to the ones taken by IACTs.
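
The equal-count class split described above amounts to quantile binning on image size. A minimal sketch in plain Python follows; the function names and the synthetic size values are hypothetical, and the authors' actual preprocessing code may differ.

```python
from bisect import bisect_right
from collections import Counter


def equal_count_boundaries(sizes, n_classes):
    """Return n_classes - 1 boundaries that split sizes into equal-count bins."""
    ordered = sorted(sizes)
    n = len(ordered)
    return [ordered[(k * n) // n_classes] for k in range(1, n_classes)]


def assign_class(size, boundaries):
    """Class index in [0, n_classes): the count of boundaries <= size."""
    return bisect_right(boundaries, size)


# Synthetic stand-in for image sizes (total pixel brightness): 100 distinct
# values in shuffled order with a skewed distribution.
sizes = [((i * 37) % 100) ** 2 + 50 for i in range(100)]

boundaries = equal_count_boundaries(sizes, n_classes=10)
labels = [assign_class(s, boundaries) for s in sizes]
counts = Counter(labels)
```

Each of the 10 classes ends up with the same number of images even though the class widths differ, so the conditional label is balanced across the training set.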


Decentralized Learning with Multi-Headed Distillation

Andrey Zhmoginov , Mark Sandler , Nolan Miller , Gus Kristiansen , Max Vladymyrov

Categories: Machine Learning | Computer Vision

2022-11-28

Decentralized learning with private data is a central problem in machine learning. We propose a novel distillation-based decentralized learning technique that allows multiple agents with private non-iid data to learn from each other, without having to share their data, weights or weight updates. Our approach is communication efficient, utilizes an unlabeled public dataset and uses multiple auxiliary heads for each client, greatly improving training efficiency in the case of heterogeneous data. This approach allows individual models to preserve and enhance performance on their private tasks while also dramatically improving their performance on the global aggregated data distribution. We study the effects of data and model architecture heterogeneity and the impact of the underlying communication graph topology on learning efficiency and show that our agents can significantly improve their performance compared to learning in isolation.


Novel structural-scale uncertainty measures and error retention curves: application to multiple sclerosis

Nataliia Molchanova , Vatsal Raina , Andrey Malinin , Francesco La Rosa , Henning Muller , Mark Gales , Cristina Granziera , Mara Graziani , Meritxell Bach Cuadra

Category: Computer Vision

2022-11-09

This paper focuses on uncertainty estimation for white matter lesion (WML) segmentation in magnetic resonance imaging (MRI). On one side, voxel-scale segmentation errors cause erroneous delineation of the lesions; on the other side, lesion-scale detection errors lead to wrong lesion counts. Both of these factors are clinically relevant for the assessment of multiple sclerosis patients. This work aims to compare the ability of different voxel-scale and lesion-scale uncertainty measures to capture errors related to segmentation and lesion detection, respectively. Our main contributions are (i) proposing new measures of lesion-scale uncertainty that do not utilise voxel-scale uncertainties; (ii) extending an error retention curve analysis framework to the evaluation of lesion-scale uncertainty measures. Our results, obtained on a multi-center testing set of 58 patients, demonstrate that the proposed lesion-scale measures achieve the best performance among the analysed measures. All code implementations are provided at https://github.com/NataliiaMolch/MS_WML_uncs
